AITopics | aadirupa saha

aadirupa saha

DP-Dueling: Learning from Preference Feedback without Compromising User Privacy

Saha, Aadirupa; Asi, Hilal

arXiv.org Artificial Intelligence, Mar-22-2024

Research has indicated that gathering feedback in relative form is often more convenient, faster, and more cost-effective than collecting absolute ratings [31, 40]. For example, when assessing an individual's preference between two items A and B, respondents typically find it easier to answer a preference query such as "Which item do you prefer, A or B?" than to rate A and B on a scale from 0 to 10. From a system designer's perspective, leveraging this preference data can significantly enhance system performance, especially when the data can be collected in a relative and online fashion. This applies to many real-world scenarios in which human preferences are gathered online, including recommender systems, crowd-sourcing platforms, training bots, multiplayer game rankings, search engine optimization, online retail, survey design, expert reviews, product selection, and even broader reinforcement learning problems with complex reward structures. Because of its broad utility and the simplicity of gathering relative feedback, learning from preferences has become highly popular in the machine learning community and has been studied extensively over the past decade under the name "Dueling-Bandits" (DB). This framework extends the traditional multi-armed bandit (MAB) setting described in [4]. In the DB framework, the goal is to identify a set of 'good' options from a fixed decision
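The relative feedback described in this excerpt can be pictured with a small simulation. The sketch below is illustrative only and is not the DP-Dueling algorithm: the preference matrix `PREF`, the helper names `duel` and `estimate_win_rates`, and the choice to compare every pair the same number of times are all assumptions made for this example.

```python
import random


def duel(pref, a, b, rng):
    """Return the winner of one noisy comparison between items a and b.

    pref[a][b] is the assumed probability that a is preferred over b.
    """
    return a if rng.random() < pref[a][b] else b


def estimate_win_rates(pref, rounds, seed=0):
    """Estimate every pairwise preference probability from repeated duels."""
    rng = random.Random(seed)
    k = len(pref)
    wins = [[0] * k for _ in range(k)]
    for a in range(k):
        for b in range(a + 1, k):
            for _ in range(rounds):
                winner = duel(pref, a, b, rng)
                loser = b if winner == a else a
                wins[winner][loser] += 1
    # An item compared with itself is treated as a coin flip (0.5).
    return [
        [0.5 if a == b else wins[a][b] / rounds for b in range(k)]
        for a in range(k)
    ]


# Hypothetical preference matrix over three items: item 0 tends to beat
# item 1, which tends to beat item 2. Entries satisfy P[a][b] + P[b][a] = 1.
PREF = [
    [0.5, 0.7, 0.8],
    [0.3, 0.5, 0.6],
    [0.2, 0.4, 0.5],
]

if __name__ == "__main__":
    for row in estimate_win_rates(PREF, rounds=2000):
        print([round(p, 2) for p in row])
```

Only the winner of each comparison is observed, never a numeric rating, which is the feedback model the excerpt contrasts with absolute scores.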

aadirupa saha, algorithm, international conference, (13 more...)

arXiv: 2403.15045

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.67)
Leisure & Entertainment > Games (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)


One Arrow, Two Kills: An Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits

Gaillard, Pierre; Saha, Aadirupa; Dan, Soham

arXiv.org Artificial Intelligence, Oct-26-2022

We address the problem of 'Internal Regret' in Sleeping Bandits in the fully adversarial setup, draw connections between the existing notions of sleeping regret in the multi-armed bandits (MAB) literature, and analyze their implications. Our first contribution is to propose a new notion of Internal Regret for sleeping MAB. We then propose an algorithm that yields sublinear regret in that measure, even for a completely adversarial sequence of losses and availabilities. We further show that a low sleeping internal regret always implies a low external regret, as well as a low policy regret for i.i.d. sequences of losses. The main contribution of this work lies precisely in unifying the different existing notions of regret in sleeping bandits and understanding the implications of one for another. Finally, we extend our results to the setting of Dueling Bandits (DB), a preference feedback variant of MAB, and propose a reduction to MAB to design a low-regret algorithm for sleeping dueling bandits with stochastic preferences and adversarial availabilities. The efficacy of our algorithms is justified through empirical evaluations.
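To make the two regret notions named in the abstract concrete, here is a minimal sketch that computes the textbook external regret and internal (swap-one-arm) regret of a played sequence. The function names and the toy data are hypothetical. The way availabilities are handled (a swap from arm i to arm j is only counted in rounds where j is available) is an assumption of this sketch and should not be read as the paper's exact definition of sleeping internal regret.

```python
def external_regret(played, losses):
    """Learner's total loss minus the total loss of the best single fixed arm."""
    k = len(losses[0])
    learner = sum(losses[t][arm] for t, arm in enumerate(played))
    best_fixed = min(sum(row[j] for row in losses) for j in range(k))
    return learner - best_fixed


def internal_regret(played, losses, available=None):
    """Largest total gain from replacing every play of one arm i by another arm j.

    If `available` is given (one set of awake arms per round), a swap i -> j
    is only counted in rounds where j is available. This restriction is an
    assumption of this sketch, not a definition taken from the paper.
    """
    k = len(losses[0])
    best = 0.0
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            gain = 0.0
            for t, arm in enumerate(played):
                if arm != i:
                    continue  # only rounds where the learner played arm i
                if available is not None and j not in available[t]:
                    continue  # arm j is asleep, so the swap is not possible
                gain += losses[t][i] - losses[t][j]
            best = max(best, gain)
    return best


# Toy data (hypothetical): two arms, four rounds, and a learner that
# always picks the arm that turns out to have loss 1.
LOSSES = [[1, 0], [0, 1], [1, 0], [0, 1]]
PLAYED = [0, 1, 0, 1]
AVAILABLE = [{0}, {1}, {0}, {1}]  # only the played arm is awake each round

if __name__ == "__main__":
    print(external_regret(PLAYED, LOSSES))
    print(internal_regret(PLAYED, LOSSES))
    print(internal_regret(PLAYED, LOSSES, AVAILABLE))
```

In the toy data the learner has no alternative arm in any round once availabilities are taken into account, so the availability-aware variant reports no regret even though the unrestricted one does.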

artificial intelligence, data mining, machine learning, (16 more...)

arXiv: 2210.14998

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.47)
